Provable Efficient Online Matrix Completion via Non-convex Stochastic Gradient Descent
Matrix completion, where we wish to recover a low-rank matrix by observing a few of its entries, is a widely studied problem in both theory and practice, with many applications. Most of the provable algorithms for this problem so far have been restricted to the offline setting, where they provide an estimate of the unknown matrix using all observations simultaneously. However, in many applications, the online version, where we observe one entry at a time and dynamically update our estimate, is more appealing. While existing algorithms are efficient for the offline setting, they could be highly inefficient for the online setting. In this paper, we propose the first provable, efficient online algorithm for matrix completion. Our algorithm starts from an initial estimate of the matrix and then performs non-convex stochastic gradient descent (SGD). After every observation, it performs a fast update involving only one row of two tall matrices, giving near-linear total runtime. Our algorithm can be naturally used in the offline setting as well, where it gives sample complexity and runtime competitive with state-of-the-art algorithms. Our proofs introduce a general framework to show that SGD updates tend to stay away from saddle surfaces, which could be of broader interest for other non-convex problems.
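The abstract's key efficiency claim is that each observation touches only one row of each of the two tall factor matrices. A minimal sketch of an SGD step in that style (not the paper's exact update; the `p` rescaling and variable names are illustrative assumptions):

```python
import numpy as np

def online_mc_step(U, V, i, j, value, lr, p):
    """One SGD step for matrix completion on the observation M[i, j] = value.

    U: (m, k) and V: (n, k) are the low-rank factors. Only row i of U and
    row j of V are read and written, so each update costs O(k), giving
    near-linear total runtime over all observations.
    p (an assumption here) is the probability of observing each entry,
    used to rescale the stochastic gradient.
    """
    residual = U[i] @ V[j] - value          # error on the observed entry
    grad_u = residual * V[j] / p            # gradient w.r.t. row i of U
    grad_v = residual * U[i] / p            # gradient w.r.t. row j of V
    U[i] = U[i] - lr * grad_u
    V[j] = V[j] - lr * grad_v
    return U, V
```

Starting from a good initialization (as the abstract requires) and streaming observed entries through this update drives the reconstruction error down without ever forming the full matrix.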
Review for NeurIPS paper: Robustness Analysis of Non-Convex Stochastic Gradient Descent using Biased Expectations
Weaknesses: While the "biased expectation" appears to be a powerful tool, the overall results are restricted to the gradients of the algorithm at _some_ time t in the last T iterates. While this is a common outcome of the standard analysis of SGD, it would be nice if (with some additional assumptions on f) the results could be transposed to f(x_t), or to x_t within some basin of attraction. The special case of s = 0 needs much more detailed treatment. While the authors point out in the supplement that \phi is continuous at s = 0, much of the document switches between looking at s -> 0 or s = 0 without explanation. Assumption 1: I see that the authors need to control \|X_t\|_2 in Thm 1. (Eq.
Review for NeurIPS paper: Robustness Analysis of Non-Convex Stochastic Gradient Descent using Biased Expectations
After significant discussion, the reviewers were unanimous in their appreciation of the simplicity and cleanliness of the approach presented by the paper. However, the authors are strongly encouraged to improve the presentation of the paper, especially the crucial proof of Lemma 1: multiple steps have been contracted in the presentation, and clarifying them is necessary. Furthermore, the case of the diminishing step-size scheme should be fleshed out in the theory rather than left as a straightforward extension. Lastly, the reviewers suggested using heavier-tailed distributions, such as the Lévy distribution, to better verify the theory.
Reviews: Provable Efficient Online Matrix Completion via Non-convex Stochastic Gradient Descent
The main contribution of this work is showing that, in the context of matrix completion, when equipped with a good initialization, the sequence of solutions produced by a widely used (but previously without theoretical guarantees) SGD algorithm converges linearly to the true matrix. The paper reads well and most of the theoretical results look sound, but I have some concerns, as follows. Is it possible to access fewer samples, e.g., O(mu * d * k * log(d)), while still ensuring global convergence? Theorem 3.1 says "for any fixed T >= 1, with probability at least 1 - T/d^{10}, we have linear convergence". That means that if we run the algorithm (i.e., repeatedly reveal samples) for a longer time, we have a LOWER confidence of obtaining the true matrix.
Robustness Analysis of Non-Convex Stochastic Gradient Descent using Biased Expectations
This work proposes a novel analysis of stochastic gradient descent (SGD) for non-convex and smooth optimization. In the case of sub-Gaussian and centered noise, we prove that, with probability 1-\delta, the number of iterations to reach a precision \varepsilon for the squared gradient norm is O(\varepsilon^{-2}\ln(1/\delta)). In the case of centered and integrable heavy-tailed noise, we show that, while the expectation of the iterates may be infinite, the squared gradient norm still converges with probability 1-\delta in O(\varepsilon^{-p}\delta^{-q}) iterations, where p, q > 2. This result shows that heavy-tailed noise on the gradient slows down the convergence of SGD without preventing it, proving that SGD is robust to gradient noise with unbounded variance, a setting of interest for Deep Learning. In addition, it indicates that choosing a step size proportional to T^{-1/b}, where b is the tail parameter of the noise and T is the number of iterations, leads to the best convergence rates.
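The abstract's practical takeaway is the horizon-dependent step size eta_0 * T^{-1/b}. A toy sketch of that rule, using symmetrized Pareto noise as an illustrative heavy-tailed stand-in (the test function, constants, and noise model are assumptions for illustration, not the paper's setup):

```python
import numpy as np

def sgd_heavy_tailed(grad, x0, T, b, eta0=0.1, seed=0):
    """Run T steps of SGD with the step size eta0 * T**(-1/b) suggested
    by the abstract for noise with tail parameter b.

    The gradient noise is centered, symmetrized Pareto with tail index b,
    so it is heavy-tailed (infinite variance for b < 2) but integrable.
    Returns the last iterate and the smallest squared gradient norm seen,
    the quantity the high-probability bounds control.
    """
    rng = np.random.default_rng(seed)
    eta = eta0 * T ** (-1.0 / b)        # horizon-dependent step size
    x = np.asarray(x0, dtype=float)
    min_sq_grad = np.inf
    for _ in range(T):
        # zero-mean heavy-tailed noise: Pareto magnitude, random sign
        noise = rng.pareto(b, size=x.shape) * rng.choice([-1.0, 1.0], size=x.shape)
        x = x - eta * (grad(x) + noise)
        min_sq_grad = min(min_sq_grad, float(np.sum(grad(x) ** 2)))
    return x, min_sq_grad
```

Because the step size shrinks with T, rare huge noise draws move the iterate only a bounded amount, which matches the abstract's message that heavy tails slow convergence without preventing it.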
Provable Efficient Online Matrix Completion via Non-convex Stochastic Gradient Descent
Jin, Chi; Kakade, Sham M.; Netrapalli, Praneeth